The role of the time-arrow in mean-square
estimation of stochastic processes 

Yongxin Chen, Johan Karlsson, and Tryphon T. Georgiou 


Abstract: The purpose of this paper is to explain a certain dichotomy
between the information that the past and future values of a multivariate
stochastic process carry about the present. More specifically,
vector-valued, second-order stochastic processes may be deterministic in
one time-direction and not the other. This phenomenon, which is absent in
scalar-valued processes, is deeply rooted in the geometry of the shift
operator. The exposition and the examples we discuss are based on the
work of Douglas, Shapiro and Shields on cyclic vectors of the backward
shift and relate to classical ideas going back to Wiener and Kolmogorov.
We focus on rank-one stochastic processes for which we present a
characterization of all regular processes that are deterministic in the
reverse time-direction. The paper builds on examples and the goal is to
provide pertinent insights to a control engineering audience.

I. INTRODUCTION 

The variance of the error in one-step-ahead prediction of a scalar,
second-order, stationary, discrete-time stochastic process is given by
a well-known formula due to G. Szegő [1] as the geometric mean

$$\exp\left\{\frac{1}{2\pi}\int_{-\pi}^{\pi}\log(\Phi(\theta))\,d\theta\right\} \tag{1}$$

of its power spectral density $\Phi(\theta)$. Past and future of the process
contain the same information about the present, and the very same
formula provides the variance of the “postdiction” error when the
present is estimated from future values. This is rather evident since
(1) contains no manifestation of the time arrow. There is no such
formula for the covariance matrix of the prediction or the postdiction
error for multivariable processes. The closest to such a formula was
given when Wiener and Masani expressed the determinant of the
error covariance, herein denoted by $\Omega$, in terms of the determinant
of the power spectrum,

$$\det(\Omega) = \exp\left\{\frac{1}{2\pi}\int_{-\pi}^{\pi}\log(\det(\Phi(\theta)))\,d\theta\right\}. \tag{2}$$

In a subtle way, when $\det(\Omega) = 0$, this formula leaves open the
possibility of a dichotomy between past and future, and as it turns out,
this is indeed the case; it is perfectly possible for a (multivariable)
stationary Gaussian stochastic process to be purely deterministic in
one time-direction and not in the other.
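As a numerical illustration of formula (1), the following minimal sketch evaluates the geometric mean for a scalar first-order moving average $x_k = w_k + \alpha w_{k-1}$ with unit-variance white noise. The coefficient `alpha`, the grid size and the function names are assumptions made for illustration only. For $|\alpha| < 1$ the one-step prediction error variance is that of the driving noise, so the result should be close to 1.

```python
import cmath
import math


def szego_geometric_mean(spectrum, n_grid=4096):
    """Approximate exp((1/2pi) * integral of log(spectrum) over [-pi, pi])."""
    total = 0.0
    for j in range(n_grid):
        # Midpoint rule on a uniform grid of the circle.
        theta = -math.pi + (j + 0.5) * 2.0 * math.pi / n_grid
        total += math.log(spectrum(theta))
    return math.exp(total / n_grid)


alpha = 0.5  # hypothetical MA(1) coefficient, |alpha| < 1


def ma1_spectrum(theta):
    # Spectral density of x_k = w_k + alpha * w_{k-1}, unit-variance white w_k.
    return abs(1.0 + alpha * cmath.exp(1j * theta)) ** 2


variance = szego_geometric_mean(ma1_spectrum)
print(variance)
```

The density in this example is an even function of $\theta$, so replacing $\theta$ by $-\theta$ (time reversal) leaves the computed value unchanged, in line with the remark that (1) carries no trace of the time arrow.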

Naturally, this issue has been duly noted in classical works in
prediction theory where it has been pointed out that the information
contained in the remote past and the information contained in the
remote future may differ; see, e.g., [?, Section 4.5]. Thus, the main
objective of the present work is to highlight and elucidate this
phenomenon with examples that are intuitively clear to an engineering
audience.

More broadly, the manifestation of the time-arrow in engineering
and physics is hardly a new issue, yet it is one that is not very well
understood. The paradox of the apparent directionality of physics
originating in physical laws that are time-symmetric is a key
conundrum; Feynman states that there is a fundamental law which says
that “uxels only make wuxels and not vice versa,” but we have not
found this yet. Thus, the time reversibility of physical models, as well
as the lack thereof, is of great interest; see, e.g., [?], [?], [6]. In a
similar vein, it is expected that the time-arrow will draw increasingly
more attention in modeling of engineering systems as well.

Turning to time-series, the possible ways in which the time-arrow
is encoded in the statistics have been studied in the physics
literature as well; see, e.g., [?]. It is widely thought that the
time-direction and “nonlinearities” are revealed by considering
several-point correlations and higher-order statistics. While this may be
so at times, it is surprising to most that the time-arrow may be clearly
discerned in second-order stationary processes as well, in that their
predictability properties may differ dramatically depending on the
time-direction. The reason that this observation is often missed (cf.
[?], [?]) may be that it is exclusively a phenomenon
of vector-valued processes. Below, we explain this point with an
example of a vector-valued moving-average process constructed so
that the prediction error differs substantially in the two time-directions
(Section III). A limit case for a stochastic process with infinite
memory allows for the process to be deterministic in one of the
two time-directions but not in the other; we highlight this with an
example as well.

Prediction theory of second-order processes overlaps with that of
analytic functions on the unit disc and the shift operator. Thus, the
exposition and technical results of the paper rely heavily on this
connection and on the work of Douglas, Shapiro and Shields [10], who
obtained a characterization of cyclic vectors of the “backward shift.”
Our analysis and examples include processes generated by filters
whose transfer functions are cyclic with respect to the backward shift,
or in a time-symmetric situation, processes generated by suitable
acausal filters that are predictable from the infinite remote past.
Besides explaining the dichotomy between past and future, and
how this relates to factorizability of the power spectrum [?], [?],
we also study regular rank-one processes and explicitly characterize
all such processes that are deterministic in the reverse-time direction.

The present paper is structured as follows. In Section II we
remind the reader about connections between second-order stochastic
processes and analytic functions. We define cyclic vectors and recall
key results from [10], [?], [?]. In Section III we present an example
of a moving-average process and we compare the corresponding
predictor and postdictor error covariances. In Section IV we present
an example that is non-deterministic in one time-direction but not so
in the other. In Section V we characterize the rank-one stochastic
processes that have this property. We conclude with a discussion on
factorizability and decompositions for this class of processes.

The notation used in this paper is now briefly defined. We let
$L_2$ be the space of square-integrable functions on the unit circle $\mathbb{T}$,
$H_2$ be the Hardy space of functions in $L_2$ whose negative Fourier
coefficients vanish, $\ell_2$ be the space of square-summable sequences,
and let $\|\cdot\|_2$ denote the norm in the respective spaces. The orthogonal
complement of $H_2$ in $L_2$ is denoted by $H_2^\perp$, while we use $\bar{H}_2$
as a short for $zH_2^\perp$. We use $\mathrm{span}_k(x_k)$ to denote the space of all
finite linear combinations of elements in $\{x_k\}_k$, a set of random
variables on a suitable probability space, and $\overline{\mathrm{span}}_k(x_k)$ for its
closure. We also use $X_1 + X_2 = \{x_1 + x_2 : x_1 \in X_1, x_2 \in X_2\}$ for
subspaces $X_1$ and $X_2$. Similarly
$\sum_{j=1}^{n} X_j = \{\sum_{j=1}^{n} x_j : x_j \in X_j,\ j = 1, \dots, n\}$.

II. Preliminaries 

The forward shift $U$ is a linear operator on the Hardy space $H_2$
defined as $Uf(z) = zf(z)$. We will often identify $H_2$ and $\ell_2$, since
they are isometric, and thus write

$$U : (a_0, a_1, a_2, \dots) \mapsto (0, a_0, a_1, \dots).$$

The backward shift $U^*$ is the adjoint operator of $U$. On $H_2$ we have
$U^*f(z) = (f(z) - f(0))/z$, and accordingly we may write

$$U^* : (a_0, a_1, a_2, \dots) \mapsto (a_1, a_2, a_3, \dots).$$
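The two shifts are easy to experiment with on finitely many Taylor coefficients. The sketch below is only an illustration: the helper names and the sample coefficients are hypothetical, and a finite list stands in for a sequence in $\ell_2$. It checks that $U^*U$ is the identity while $UU^*$ discards the leading coefficient, and that the sequence form of $U^*$ agrees with $(f(z) - f(0))/z$ at a test point.

```python
def forward_shift(coeffs):
    """U: (a0, a1, a2, ...) -> (0, a0, a1, ...), i.e. multiplication by z."""
    return [0] + list(coeffs)


def backward_shift(coeffs):
    """U*: (a0, a1, a2, ...) -> (a1, a2, a3, ...), i.e. (f(z) - f(0)) / z."""
    return list(coeffs[1:])


def evaluate(coeffs, z):
    """Evaluate the polynomial f(z) = sum_k coeffs[k] * z**k."""
    return sum(c * z**k for k, c in enumerate(coeffs))


f = [3, 1, 4, 1, 5]   # hypothetical coefficients a0..a4
z0 = 0.3 + 0.2j       # arbitrary test point inside the unit disc

# U* undoes U, but U does not undo U*: the constant term is lost.
print(backward_shift(forward_shift(f)))
print(forward_shift(backward_shift(f)))

# Sequence form of U* versus the function form (f(z) - f(0)) / z.
lhs = evaluate(backward_shift(f), z0)
rhs = (evaluate(f, z0) - evaluate(f, 0)) / z0
print(abs(lhs - rhs))
```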

Cyclic vectors of an operator $A$ are those vectors $f$ such that the
closure of the span of $\{A^n f : n \ge 0\}$ is the whole space; when $f$ is
not a cyclic vector (non-cyclic), the closure of the span is a proper
$A$-invariant subspace.¹ As is well known, $f \in H_2$ is cyclic for $U$ if
and only if $f$ is an outer function.² When this is not the case, $f$ lies
in some closed invariant subspace of $U$, that is, a subspace of the
form $\phi H_2$ for some inner³ function $\phi$. An invariant subspace of $U^*$
is of the form $(\phi H_2)^\perp$. Therefore $f$ fails to be cyclic under $U^*$ if
and only if it is orthogonal to one of the spaces $\phi H_2$ with $\phi$ inner.
This is not a property that can be easily checked. A more transparent
condition for failure to be cyclic with respect to $U^*$ is given by the
following theorem of Douglas-Shapiro-Shields [10].

Theorem 1: A necessary and sufficient condition that a function $f$
in $H_2$ be $U^*$ non-cyclic is that there exists a pair of inner functions
$\phi$ and $\psi$ such that

$$\frac{f}{\bar{f}} = \frac{\phi}{\psi} \quad \text{almost everywhere on } \mathbb{T}.$$

There are several easy but quite surprising properties of $U^*$ cyclic
functions, see [10]. For instance, i) a function is $U^*$ cyclic if and only
if its outer factor is, and ii) if $f$ is $U^*$ cyclic and $g$ is non-cyclic,
then $f + g$, $fg$ and $f/g$ are all cyclic as long as they are in $H_2$.
Throughout the rest of this paper, “cyclic” means cyclic with respect
to $U^*$ unless otherwise stated.
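A rough way to get a feeling for non-cyclic functions is to look at how many linearly independent backward shifts a function has. The sketch below is an illustration under stated assumptions: it uses the hypothetical rational function $h(z) = 1/(1 - z/2)$, for which $U^*h = h/2$ so that the backward shifts span a one-dimensional space, and compares it with the harmonic coefficients $1/(1+\ell)$. Only the first `n` coefficients of the first `n` shifts are inspected, so the computation can exhibit non-cyclicity of $h$ but cannot prove cyclicity of anything; the helper names are not from the paper.

```python
from fractions import Fraction


def hankel_rows(coeff, n):
    """First n Taylor coefficients of U*^k f for k = 0..n-1.

    `coeff(l)` returns the l-th Taylor coefficient of f.
    """
    return [[coeff(k + l) for l in range(n)] for k in range(n)]


def rank(matrix):
    """Exact rank by Gaussian elimination over the rationals."""
    rows = [list(r) for r in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


n = 8
geometric = hankel_rows(lambda l: Fraction(1, 2) ** l, n)  # h(z) = 1 / (1 - z/2)
harmonic = hankel_rows(lambda l: Fraction(1, l + 1), n)    # coefficients 1 / (1 + l)

rank_geometric = rank(geometric)
rank_harmonic = rank(harmonic)
print(rank_geometric, rank_harmonic)
```

Exact rational arithmetic is used on purpose: the truncated Hankel matrix of the harmonic coefficients is a Hilbert matrix, which is nonsingular but very ill-conditioned, so a floating-point rank estimate could be misleading.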

In $L_2$, we define $Uf(z) = zf(z)$ and its inverse as $U^{-1}f(z) =
z^{-1}f(z)$. The invariant subspaces of $L_2$ for $U$ and $U^{-1}$ are slightly
different from those of $H_2$ and have been extensively studied in [?].
The following extension of the Beurling-Lax theorem [?] will be needed
in this paper.

Proposition 2: If a subspace $M$ of $L_2$ is invariant for $U^{-1}$, while
not for $U$, then it has the form $M = q\bar{H}_2$ for some unimodular
function $q$.

The connection and correspondence between function theory on
the unit disc and discrete-time, stationary stochastic processes is
well known and we will make extensive use of the various facts
given in the concise reference [?, Chapter 10]. The basis of the
connection is the standard Kolmogorov isomorphism between the
linear space generated by a second-order stochastic process
$\{x_k \mid k \in \mathbb{Z}\}$ and functions in $L_2$ (on the unit circle). Throughout
this paper, we follow the mathematical convention where $U$ (equivalently,
multiplication by $z$ or $e^{i\theta}$) corresponds to unit time-delay for the
corresponding process. Thus, in the usual slight abuse of notation,
$z : x_k \mapsto x_{k-1}$ is the “delay operator”, which is opposite to the
way this is most often used in the signal processing literature.⁴
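The isomorphism and the delay convention can be checked on a small case. The sketch below assumes a hypothetical scalar process $x_k = w_k + \alpha w_{k-1}$ driven by unit-variance white noise, represented by $g(z) = 1 + \alpha z$; under the convention above, $x_k$ corresponds to $z^{-k}g(z)$. Covariances of the random variables should then equal inner products of the corresponding functions on the unit circle. All names and the value of `alpha` are illustrative assumptions.

```python
import cmath
import math

alpha = 0.7  # hypothetical real coefficient


def g(z):
    # Transfer function of x_k = w_k + alpha * w_{k-1}; z stands for a unit delay.
    return 1.0 + alpha * z


def inner_product(f1, f2, n_grid=64):
    """<f1, f2> = (1/2pi) * integral of f1 * conj(f2) over the unit circle.

    A uniform grid is exact here because the integrands are
    trigonometric polynomials of low degree.
    """
    total = 0.0 + 0.0j
    for j in range(n_grid):
        z = cmath.exp(2j * math.pi * j / n_grid)
        total += f1(z) * f2(z).conjugate()
    return total / n_grid


def x_image(k):
    """Function on the circle representing x_k, namely z**(-k) * g(z)."""
    return lambda z: z ** (-k) * g(z)


cov_x1_x0 = inner_product(x_image(1), x_image(0))  # should match E{x_1 x_0} = alpha
var_x0 = inner_product(x_image(0), x_image(0))     # should match 1 + alpha**2
print(cov_x1_x0, var_x0)
```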


III. Comparison of predictor/postdictor error for a moving-average process

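It is often suggested that for Gaussian stationary processes the time
direction does not have an impact on the error variance (cf. [?], [?]).
As noted earlier, this is not so for multivariable processes. We first
illustrate this fact with the moving-average bivariate process defined
as follows. Consider the difference equations

$$x_k = w_k + \alpha w_{k-1} \tag{3a}$$
$$y_k = w_k, \tag{3b}$$

where $\alpha \neq 0$ and the process $\{w_k \mid k \in \mathbb{Z}\}$ is taken to be
Gaussian, zero-mean, unit-variance and white, i.e., $E\{w_k w_k\} = 1$
and $E\{w_k w_\ell\} = 0$ for $k \neq \ell$, and consider the stochastic process

$$\xi_k = \begin{pmatrix} x_k \\ y_k \end{pmatrix}.$$

We are interested in one-step-ahead linear prediction.⁵ Thus, we seek
to minimize the (matrix) error variance

$$E\{(\xi_0 - \hat{\xi}_{0|\mathrm{past}})(\xi_0 - \hat{\xi}_{0|\mathrm{past}})^*\}$$

in the positive-semidefinite sense. Here, $\hat{\xi}_{0|\mathrm{past}}$ is a function of
past measurements $x_{-1}, x_{-2}, \dots$ and $y_{-1}, y_{-2}, \dots$. Since
$w_0 \perp x_{-\ell}, y_{-\ell}$ for $\ell > 0$, the solution is easily seen to be

$$\hat{\xi}_{0|\mathrm{past}} = \begin{pmatrix} \hat{x}_{0|\mathrm{past}} \\ \hat{y}_{0|\mathrm{past}} \end{pmatrix} = \begin{pmatrix} \alpha y_{-1} \\ 0 \end{pmatrix}$$

with a corresponding (forward) error variance

$$\Omega_f := \min_{\hat{\xi}_{0|\mathrm{past}}} E\{(\xi_0 - \hat{\xi}_{0|\mathrm{past}})(\xi_0 - \hat{\xi}_{0|\mathrm{past}})^*\} = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}.$$

In the reverse time direction, since $y_{k+1} = w_{k+1}$ and hence

$$w_k = (x_{k+1} - y_{k+1})/\alpha,$$

we can write the dynamics (3) as

$$x_k = (x_{k+1} - y_{k+1})/\alpha + \alpha w_{k-1}$$
$$y_k = (x_{k+1} - y_{k+1})/\alpha.$$

Similar to the above argument for the forward time-direction, $w_{-1}$
is orthogonal to future measurements $x_1, x_2, \dots$ and $y_1, y_2, \dots$, and
hence, given future values, the optimal estimator for $x_0, y_0$ is

$$\hat{\xi}_{0|\mathrm{future}} = \begin{pmatrix} \hat{x}_{0|\mathrm{future}} \\ \hat{y}_{0|\mathrm{future}} \end{pmatrix} = \begin{pmatrix} (x_1 - y_1)/\alpha \\ (x_1 - y_1)/\alpha \end{pmatrix}$$

with corresponding minimal (backward) error variance

$$\Omega_b := \min_{\hat{\xi}_{0|\mathrm{future}}} E\{(\xi_0 - \hat{\xi}_{0|\mathrm{future}})(\xi_0 - \hat{\xi}_{0|\mathrm{future}})^*\} = \begin{pmatrix} \alpha^2 & 0 \\ 0 & 0 \end{pmatrix}.$$

The prediction problem is clearly not symmetric with respect to time,
yet $\det \Omega_f = \det \Omega_b = 0$, in agreement with the Wiener-Masani
formula (2).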
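The two error covariances can be reproduced numerically by projecting $x_0$ and $y_0$ onto a finite window of past or future samples. In the sketch below every random variable is stored as its vector of coefficients on the white-noise samples, so that covariances become ordinary dot products. The window length `N`, the value of `alpha` (taken real) and all helper names are assumptions made for illustration; the projection uses plain modified Gram-Schmidt rather than any method from the paper.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def residual(target, observations, tol=1e-12):
    """Part of `target` orthogonal to the span of `observations`."""
    basis = []
    for obs in observations:
        v = list(obs)
        for q in basis:
            c = dot(v, q)
            v = [a - c * b for a, b in zip(v, q)]
        norm = dot(v, v) ** 0.5
        if norm > tol:  # skip linearly dependent observations
            basis.append([a / norm for a in v])
    r = list(target)
    for q in basis:
        c = dot(r, q)
        r = [a - c * b for a, b in zip(r, q)]
    return r


def error_covariance(targets, observations):
    res = [residual(t, observations) for t in targets]
    return [[dot(a, b) for b in res] for a in res]


alpha, N = 0.5, 4                # hypothetical coefficient and window length
size, offset = 2 * N + 2, N + 1  # noise samples w_{-N-1}, ..., w_N


def x(k):
    """Coefficients of x_k = w_k + alpha * w_{k-1} on the noise samples."""
    v = [0.0] * size
    v[k + offset] = 1.0
    v[k - 1 + offset] = alpha
    return v


def y(k):
    """Coefficients of y_k = w_k on the noise samples."""
    v = [0.0] * size
    v[k + offset] = 1.0
    return v


past = [f(-l) for l in range(1, N + 1) for f in (x, y)]
future = [f(l) for l in range(1, N + 1) for f in (x, y)]
omega_f = error_covariance([x(0), y(0)], past)    # expect all entries equal to 1
omega_b = error_covariance([x(0), y(0)], future)  # expect alpha**2 in the corner only
print(omega_f)
print(omega_b)
```

For this process a window of length one already attains the optimum in both directions, so increasing `N` should not change the result.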



¹ A subspace $X$ is $A$-invariant if $AX \subset X$.

² Outer functions are also known in the engineering literature as
minimum-phase; for definition and properties see [?].

³ Inner functions are known in the engineering literature as all-pass; for
definition and properties see again [?].

⁴ In signal processing, $z^{-1}$ is often used to denote delay and causal transfer
functions of linear operators are analytic outside the unit disc; our convention
is the opposite.

⁵ Without loss of generality we consider only prediction of $x_0, y_0$ since all
processes in this paper are stationary.






The above example is sufficient to underscore the dichotomy. The
forward and reversed processes have similar realizations (cf. [?]).
Indeed, we can easily see that

$$x_k = \alpha \bar{w}_k + \bar{w}_{k+1}$$
$$y_k = \bar{w}_{k+1},$$

is a realization for the backward process, where $\bar{w}_k$ is a standard
Gaussian white-noise process. The forward and backward realizations
can be derived from, and correspond to, the left and the right analytic
factors

$$\Phi(z) = \begin{pmatrix} 1 + \alpha z \\ 1 \end{pmatrix} \begin{pmatrix} 1 + \alpha z^{-1} & 1 \end{pmatrix} = \begin{pmatrix} z^{-1} + \alpha \\ z^{-1} \end{pmatrix} \begin{pmatrix} z + \alpha & z \end{pmatrix} \tag{4}$$

of the power spectrum $\Phi(z)$. It is possible to go one step further and
construct examples where this factorization is not possible in one
direction and, then, in a corresponding time-direction the process is
completely deterministic.
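A quick numerical check, offered only as a sketch, confirms that the column vectors associated with the forward realization (3) and with the backward realization give the same rank-one matrix on the unit circle, and that its determinant vanishes at the sampled frequency. The value of `alpha` (taken real), the sample point and the function names are assumptions for illustration.

```python
import cmath

alpha = 0.5  # hypothetical real coefficient


def outer_product(col):
    """Rank-one matrix col * col^H for a 2-vector col."""
    return [[a * b.conjugate() for b in col] for a in col]


def spectrum_forward(z):
    # Built from x_k = w_k + alpha * w_{k-1}, y_k = w_k (z is a unit delay).
    return outer_product([1 + alpha * z, 1])


def spectrum_backward(z):
    # Built from the backward realization with one-step advances 1/z.
    return outer_product([alpha + 1 / z, 1 / z])


z = cmath.exp(0.9j)  # arbitrary point on the unit circle
phi_f = spectrum_forward(z)
phi_b = spectrum_backward(z)

max_gap = max(abs(phi_f[i][j] - phi_b[i][j]) for i in range(2) for j in range(2))
determinant = phi_f[0][0] * phi_f[1][1] - phi_f[0][1] * phi_f[1][0]
print(max_gap, abs(determinant))
```

Since the spectrum is singular at every frequency, formula (2) only yields $\det(\Omega) = 0$ and cannot by itself distinguish the forward from the backward error covariance.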



IV. A non-reversible stochastic process


The following example presents a situation where the power
spectrum does not admit one of the two analytic factorizations and
the underlying process is completely deterministic in one of the
time-directions and not in the other. The stochastic process we consider
is generated by

$$\begin{pmatrix} x_k \\ y_k \end{pmatrix} = \begin{pmatrix} w_k \\ w_k \end{pmatrix} + \sum_{\ell=1}^{\infty} \begin{pmatrix} \frac{1}{1+\ell} \\ 0 \end{pmatrix} w_{k-\ell}.$$

The modeling filter

$$g(z) = \sum_{\ell=0}^{\infty} \frac{1}{1+\ell} z^{\ell}$$

for the $x_k$ component has as impulse response the harmonic series.
Interestingly, while this is not a stable system in an input-output
sense, when driven by a white-noise process, it generates a
well-defined stochastic process with finite variance since the harmonic
series is square-summable. Further, the function $g(z)$ is cyclic [10]
and, as we will see, a direct consequence is that the process is
completely deterministic in the backward time-direction.
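The distinction between the two kinds of summability is easy to see numerically. The following sketch (function and variable names are illustrative) prints partial sums of the impulse response $1/(1+\ell)$ and of its squares: the former keep growing, roughly like a logarithm, while the latter approach $\pi^2/6$, which is the variance of $x_k$ for unit-variance driving noise.

```python
import math


def partial_sums(n_terms):
    """Partial sums of sum 1/(1+l) and sum 1/(1+l)^2 for l = 0..n_terms-1."""
    s1 = sum(1.0 / (1 + l) for l in range(n_terms))
    s2 = sum(1.0 / (1 + l) ** 2 for l in range(n_terms))
    return s1, s2


for n in (10, 1000, 100000):
    s1, s2 = partial_sums(n)
    print(n, round(s1, 3), round(s2, 6))

variance_limit = math.pi ** 2 / 6  # limit of the sum of squares
print(variance_limit)
```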

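Since $w_0 \perp x_{-\ell}, y_{-\ell}$ for $\ell > 0$, the optimal predictor is given by

$$\begin{pmatrix} \hat{x}_{0|\mathrm{past}} \\ \hat{y}_{0|\mathrm{past}} \end{pmatrix} = \begin{pmatrix} \sum_{\ell=1}^{\infty} \frac{1}{1+\ell} y_{-\ell} \\ 0 \end{pmatrix}$$

with a corresponding (forward) error variance

$$\inf_{\hat{\xi}_{0|\mathrm{past}}} E\{(\xi_0 - \hat{\xi}_{0|\mathrm{past}})(\xi_0 - \hat{\xi}_{0|\mathrm{past}})^*\} =: \Omega_f = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}.$$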

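In the reverse time-direction, we estimate $x_0, y_0$ given future
observations $x_\ell, y_\ell$ for $\ell > 0$. Since $w_\ell = y_\ell$, this is the same
as estimating $x_0, y_0$ given

$$\tilde{x}_k := x_k - \sum_{\ell=0}^{k-2} \frac{1}{1+\ell} w_{k-\ell} = \sum_{\ell=k-1}^{\infty} \frac{1}{1+\ell} w_{k-\ell},$$

and $y_k$, for $k > 0$. Now, $\mathrm{span}_{k>1}\{y_k\}$ is orthogonal to $x_0, y_0, y_1$ and
$\mathrm{span}_{k>0}\{\tilde{x}_k\}$, and hence the estimation problem above is equivalent
to estimating $x_0, y_0$ based on $y_1$ and $\tilde{x}_k$ for $k > 0$. In fact, $y_1$ is
not needed and, as we will see, $x_0, y_0$ can be predicted with arbitrary
precision based only on $\tilde{x}_k$ for $k > 0$. The relation between $\tilde{x}_k$ and
$w_k$ for all $k$ can be expressed as

$$\tilde{x} := \begin{pmatrix} \tilde{x}_1 \\ \tilde{x}_2 \\ \vdots \end{pmatrix} = \begin{pmatrix} 1 & \frac{1}{2} & \frac{1}{3} & \cdots \\ \frac{1}{2} & \frac{1}{3} & \frac{1}{4} & \cdots \\ \vdots & & & \\ \frac{1}{n} & \frac{1}{n+1} & \frac{1}{n+2} & \cdots \\ \vdots & & & \end{pmatrix} \begin{pmatrix} w_1 \\ w_0 \\ w_{-1} \\ \vdots \end{pmatrix} =: \mathcal{H} w, \tag{5}$$

where $\mathcal{H}$ denotes the (infinite) Hilbert matrix, or equivalently, the
representation of a Hankel operator with symbol the harmonic series.
Note that the $(k+1)$th row of $\mathcal{H}$ corresponds to the backward shifted
impulse responses

$$U^{*k} g(z) = \sum_{\ell=0}^{\infty} \frac{1}{k+1+\ell} z^{\ell} \quad \text{for } k = 0, 1, \dots,$$

and since $g(z)$ is cyclic, we have $\overline{\mathrm{span}}_{k \ge 0}\{U^{*k} g(z)\} = H_2$.
Combining this and the Kolmogorov isomorphism we deduce that
any linear combination of $w$ with finite norm can be approximated
by a finite linear combination of $\tilde{x}$ with arbitrarily small error. The
infimum of the backward error variance is therefore

$$\Omega_b := \inf_{\hat{\xi}_{0|\mathrm{future}}} E\{(\xi_0 - \hat{\xi}_{0|\mathrm{future}})(\xi_0 - \hat{\xi}_{0|\mathrm{future}})^*\} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$$

and the time series $\{\xi_k\}$ is uniquely determined by the infinite future
(cf. [?], [?]).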

An alternative path to arrive at the same conclusion and deduce that
the backward prediction error variance is zero can be based on
well-known properties of the Hilbert matrix in $\ell_2$. The Hilbert matrix
does not have a discrete spectrum [?]; see [?] for an elementary
proof. Therefore, its range is dense [?]. That is, for any vector
$r = (r_1, r_0, r_{-1}, \dots)^T \in \ell_2$, there is a vector
$a = (a_1, a_2, \dots)^T \in \ell_2$, with a finite number of nonzero elements,
making the difference

$$\|r^T - a^T \mathcal{H}\|_2$$

arbitrarily small. Thus, any linear combination of elements $w_k$ for
$k \le 1$, namely $r^T w$, can be approximated by $a^T \tilde{x}$ with arbitrary
precision. Therefore, any element $x_k, y_k, w_k$ with $k \in \mathbb{Z}$ is either
known or can be predicted with arbitrary precision.
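To see the approximation at work, the sketch below estimates $y_0 = w_0$ from the first $N$ variables $\tilde{x}_1, \dots, \tilde{x}_N$, that is, from the first $N$ rows of the Hilbert matrix in (5), and records the squared error as $N$ grows. Two caveats apply and are assumptions of this illustration: the noise history is truncated to `M` terms (the truncated model is a finite moving average and is not itself backward deterministic, so only $N$ much smaller than `M` is meaningful), and the projection is computed with modified Gram-Schmidt on an ill-conditioned family, which limits the usable range of $N$ in double precision.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def residual(target, observations, tol=1e-10):
    """Part of `target` orthogonal to the span of `observations`."""
    basis = []
    for obs in observations:
        v = list(obs)
        for q in basis:
            c = dot(v, q)
            v = [a - c * b for a, b in zip(v, q)]
        norm = dot(v, v) ** 0.5
        if norm > tol:  # skip numerically dependent rows
            basis.append([a / norm for a in v])
    r = list(target)
    for q in basis:
        c = dot(r, q)
        r = [a - c * b for a, b in zip(r, q)]
    return r


M = 2000  # truncation of the noise history (assumption; the true model has no cutoff)


def hilbert_row(k):
    """Coefficients of x~_k on (w_1, w_0, w_-1, ...), truncated to M terms."""
    return [1.0 / (k + l) for l in range(M)]


w0 = [0.0] * M
w0[1] = 1.0  # y_0 = w_0 is the second entry of (w_1, w_0, w_-1, ...)

errors = []
for N in range(1, 7):
    rows = [hilbert_row(k) for k in range(1, N + 1)]
    r = residual(w0, rows)
    errors.append(dot(r, r))

print(errors)
```

The recorded errors decrease with $N$, but only slowly, which gives a concrete sense of how long a window of future values may be needed before the backward determinism becomes visible.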


V. Characterization of backward deterministic rank-one processes

An $n$-dimensional Gaussian stochastic process is regular if its
spectrum admits a (right) analytic factorization (see, e.g., [?]), and
hence may be represented in the form

$$\xi_k = \sum_{\ell=0}^{\infty} G_\ell w_{k-\ell}$$

where $w_k$ is a white-noise process, and the sequence $\{G_k\}_{k \ge 0} \in \ell_2$
[?], [?]. Rank-one regular processes are those where the white-noise
process $w_k$ may be taken as a scalar process.

Building on the example from the previous section and using
results from [10], we characterize all the regular rank-one processes
that are backward deterministic. We start by identifying a subclass of
bivariate processes that contains the example from Section IV. Below
we take $G_\ell = (g_\ell, h_\ell)^T \in \mathbb{C}^2$, that is, $g_\ell, h_\ell \in \mathbb{C}$.

Theorem 3: Consider the stochastic processes

$$x_k = \sum_{\ell=0}^{\infty} g_\ell w_{k-\ell}, \tag{6a}$$
$$y_k = \sum_{\ell=0}^{\infty} h_\ell w_{k-\ell}, \tag{6b}$$

where $g(z) = \sum_{\ell=0}^{\infty} g_\ell z^\ell$ is cyclic and
$h(z) = \sum_{\ell=0}^{\infty} h_\ell z^\ell \neq 0$ is
non-cyclic. Then the backward process is deterministic.

To show this, we need the following lemma.

Lemma 4: If $h$ is a non-cyclic function, then there exists an inner
function $\psi$ such that

$$\overline{\mathrm{span}}_{k \ge 0}\{z^{-k} \psi h\} \supset \bar{H}_2.$$

Proof: See the Appendix. ■

Proof of Theorem 3: Let the inner function $\psi$ be selected
according to Lemma 4 so that $\overline{\mathrm{span}}_{k \ge 0}\{z^{-k} \psi h\} \supset \bar{H}_2$. The
backward prediction error for $x_0$ is bounded by

$$\sqrt{E(\|x_0 - \hat{x}_{0|\mathrm{future}}\|^2)} = \|a^* g + b^* h\|_2$$
$$= \|\psi (a^* g + b^* h)\|_2$$
$$\le \|P_{H_2}(\psi a^* g)\|_2 + \|\psi b^* h + P_{H_2^\perp}(\psi a^* g)\|_2 \tag{7}$$

where $a$ and $b$ are polynomials with $a(0) = 1$ and $b(0) = 0$,
corresponding to the predictor
$\hat{x}_{0|\mathrm{future}} = -\sum_{\ell \ge 1} (\bar{a}_\ell x_\ell + \bar{b}_\ell y_\ell)$,
with $a_\ell, b_\ell$ the coefficients of $a$ and $b$.
The inequality in (7) follows from the triangle inequality. Since $g$
is cyclic, so is $U^*(g\psi)$, hence $\mathrm{span}_{k \ge 1}\{U^{*k}(g\psi)\}$ is dense in
$H_2$, and therefore the first term of (7), which equals

$$\|P_{H_2}(\psi a^* g)\|_2 = \Big\|\psi g + \sum_{\ell \ge 1} \bar{a}_\ell U^{*\ell}(\psi g)\Big\|_2,$$

can be made arbitrarily small by selecting the polynomial $a$ properly.
Since $\overline{\mathrm{span}}_{k \ge 0}\{z^{-k} \psi h\} \supset \bar{H}_2$, the polynomial $b$ can be
selected so that the second term of (7) is arbitrarily small as well.

A similar argument can be used to show that $y_0$ can be estimated
with arbitrarily small error, by considering polynomials with $a(0) = 0$
and $b(0) = 1$ in (7). This completes the proof. ■

Following the same lines as in the proof of Theorem 3, one can
in fact show that

$$\overline{\mathrm{span}}_{k \ge 0}\{z^{-k} g\} + \overline{\mathrm{span}}_{k \ge 0}\{z^{-k} h\} = L_2,$$

and therefore, the backward prediction error is zero as a result of the
Kolmogorov isomorphism. In this case the input sequence $w_k$, for
$k \in \mathbb{Z}$, may be reconstructed arbitrarily well from the future output
$x_k, y_k$ for $k > 0$.

As stated in Theorem 1 [10], a function $g \in H_2$ is non-cyclic if and
only if $g/\bar{g}$ belongs to $\mathcal{J}$, the set of unimodular functions that are
quotients of inner functions,⁶ i.e., $\mathcal{J} = \{\phi/\psi : \phi, \psi \text{ inner}\}$. This
result is central to our characterization of backward deterministic
rank-one processes, and leads to our main result.

Theorem 5: Let $g, h \in H_2$; then the following conditions are
equivalent:

(a) The system (6) is backward deterministic,

(b) $\overline{\mathrm{span}}_{k \ge 0}\{z^{-k} g\} + \overline{\mathrm{span}}_{k \ge 0}\{z^{-k} h\} = L_2$,

(c) $g\bar{h}/(\bar{g}h) \notin \mathcal{J}$.

Proof: See the Appendix. ■

⁶ The set $\mathcal{J}$ is dense in the set of all unimodular functions with respect to
the $L_2$-norm. See [?] for more discussion on $\mathcal{J}$.

Note that Theorem 3 follows as a special case, from the equivalence
between (a) and (c) and by using the fact that $g/\bar{g} \notin \mathcal{J}$ and
$h/\bar{h} \in \mathcal{J}$ (see Theorem 1).

In view of Theorem 3 we define backward deterministic processes
generated by a set of functions as follows.

Definition 6: The functions $g^{(1)}, g^{(2)}, \dots, g^{(n)} \in H_2$ are called
backward deterministic if

$$\sum_{i=1}^{n} \overline{\mathrm{span}}_{k \ge 0}\{z^{-k} g^{(i)}\} = L_2. \tag{8}$$

As a corollary to Theorem 5 we also get an analogous result for
general vector-valued rank-one processes.

Corollary 7: The non-zero functions $g^{(1)}, g^{(2)}, \dots, g^{(n)} \in H_2$ are
backward deterministic if and only if

$$\frac{g^{(1)} \bar{g}^{(j)}}{\bar{g}^{(1)} g^{(j)}} \notin \mathcal{J} \tag{9}$$

for some $j = 2, \dots, n$.

Proof: See the Appendix. ■


VI. Concluding remarks

While the moving-average process in Section III admits a spectral
factorization for the backward process (4), there is no such
factorization for the non-reversible processes in Section IV and Section V.
This may be viewed in the context of decompositions for stochastic
processes that are Gaussian, zero mean, and stationary. Namely, the
Hilbert space generated by any such process may be decomposed as

$$H(\xi) = H_{-\infty}(\xi_k) \oplus H(w_k)$$

in terms of the remote past and the driving noise, namely,

$$H_{-\infty}(\xi_k) = \cap_{t \in \mathbb{Z}}\, \overline{\mathrm{span}}_{k \le t}\{\xi_k\}, \text{ and}$$
$$H(w_k) = \overline{\mathrm{span}}_{k \in \mathbb{Z}}\{w_k\},$$

and similarly in terms of the remote future and the Hilbert space
generated in the backward direction by a driving noise

$$H(\xi) = H_{+\infty}(\xi_k) \oplus H(\bar{w}_k)$$

(see [?, Section 4.5] for details). The process is reversible if the
remote past and the remote future coincide. Here we have considered
concrete examples of non-reversible processes where the remote
past is trivial while the remote future spans the entire process.

The essence of these examples (Sections IV and V, and cf. [?]) is that
the power spectrum of $\{\xi_k\}$, being

$$\begin{pmatrix} g(z) \\ h(z) \end{pmatrix} \begin{pmatrix} g(z)^* & h(z)^* \end{pmatrix},$$

fails to have a co-analytic spectral factorization, or equivalently, the
backward process is not regular [?]. This can be shown using
Theorem 1 (see also [?]). It also fails to satisfy condition 3 of
Theorem 2 in [?]. This absence of co-analytic factorization renders
the backward process deterministic.

It is quite apparent that the issues herein are quite technical in
nature. Yet, they impact in significant ways the relevance of certain
models for time-series. Indeed, existence of left and right analytic
factorizations for the corresponding power spectra may fail and, in
such cases, the dichotomy between past and future becomes central.
On the other hand, when the power spectrum is coercive (nonsingular
at every frequency), it admits both left and right analytic spectral
factors and the issue becomes moot. But even so, the limiting case
where the power spectrum ceases to be coercive and the dichotomy
appears requires further understanding from a practical standpoint.
In particular, it is of interest to understand the interplay between the
scales and window lengths required to estimate the present from past
and future values, respectively. Deeper insights on how prediction and
postdiction in stochastic models relate to time directionality, causality,
ergodicity and even mixing may also result from further exploration of
these issues.

Appendix

A. Proof of Lemma 4

Since $h$ is non-cyclic there exists an inner function $\phi$ such that
$\overline{\mathrm{span}}_{k \ge 0}\, U^{*k} h = (\phi H_2)^\perp$, hence

$$\overline{\mathrm{span}}_{k \ge 0}\, z^{-k} h \subset \phi H_2^\perp,$$
$$\overline{\mathrm{span}}_{k \ge 0}\, z^{-k} h \bar{\phi} z \subset \bar{H}_2.$$

Since the left-hand side is invariant with respect to $z^{-1}$, it follows
from the Beurling-Lax theorem [?], [?] that it is of the form
$\bar{\eta} \bar{H}_2$ where $\eta$ is inner, that is,

$$\overline{\mathrm{span}}_{k \ge 0}\, z^{-k} h \bar{\phi} z = \bar{\eta} \bar{H}_2.$$

Taking $\psi = z\eta$, which is inner, we obtain

$$\overline{\mathrm{span}}_{k \ge 0}\, z^{-k} h \psi \supset \overline{\mathrm{span}}_{k \ge 0}\, z^{-k} h \psi \bar{\phi} = \bar{H}_2. \qquad ■$$

B. Proof of main theorem (Theorem 5)

In order to prove the main theorem we will use the following
lemmas.

Lemma 8: For any function $g \in H_2$, the subspace
$\overline{\mathrm{span}}_{k \ge 0}\{z^{-k} g\}$ is equal to $q\bar{H}_2$, where
$q = g/\bar{g}_{\mathrm{outer}}$ and $g_{\mathrm{outer}}$ is the outer part of $g$.

Proof: Clearly $M = \overline{\mathrm{span}}_{k \ge 0}\{z^{-k} g\} \subset L_2$ is an invariant
subspace for $U^{-1}$ while not for $U$, so it has the form $M = q\bar{H}_2$ for
some unimodular function $q$, and $M = q \oplus z^{-1}M$. The function $q$ is
determined by the subspace up to a constant factor. We next compute
one such $q$. Since $q \in M$, we have that $q = g\bar{f}$ for some analytic
function $f$. We claim that one feasible $f$ is given by

$$f = 1/g_{\mathrm{outer}},$$

where $g_{\mathrm{outer}}$ is the outer factor of $g$. Note that
$q = g\bar{f} = g/\bar{g}_{\mathrm{outer}}$ is indeed a unimodular function. To see
$M = q\bar{H}_2$, it is enough to show $q \perp z^{-1}M$, which is equivalent to
$q \perp z^{-k} g$ for all $k \ge 1$. This follows from

$$\langle q, z^{-k} g \rangle = \langle g\bar{g}/\bar{g}_{\mathrm{outer}}, z^{-k} \rangle = \langle g_{\mathrm{outer}}, z^{-k} \rangle = 0, \quad k \ge 1,$$

which completes the proof. ■

Lemma 9: A function $g \in H_2$ is non-cyclic if and only if
$q \in \mathcal{J}$, where $q$ is a unimodular function satisfying
$q\bar{H}_2 = \overline{\mathrm{span}}_{k \ge 0}\{z^{-k} g\}$.

Proof: Lemma 8 implies that the unimodular function $q$ is of
the form

$$q = \lambda \frac{g}{\bar{g}_{\mathrm{outer}}}$$

for some constant $|\lambda| = 1$. A function $g \in H_2$ is non-cyclic if and
only if there exists a pair of inner functions $\phi$ and $\psi$ such that

$$\frac{g}{\bar{g}} = \frac{\phi}{\psi} \quad \text{almost everywhere on } \mathbb{T}.$$

By combining these two facts, it follows that if $g \in H_2$ is non-cyclic,
then

$$q = \lambda \frac{g}{\bar{g}\, g_{\mathrm{inner}}} = \lambda \frac{\phi}{\psi\, g_{\mathrm{inner}}} \in \mathcal{J}.$$

Conversely, if $q \in \mathcal{J}$, then

$$\frac{g}{\bar{g}} = \bar{\lambda}\, q\, g_{\mathrm{inner}} = \frac{\phi}{\psi}$$

for some inner functions $\phi, \psi$, and hence $g$ is non-cyclic. ■

Lemma 10: Let $q_1$ and $q$ be unimodular functions; then
$q_1\bar{H}_2 \subset q\bar{H}_2$ if and only if $\phi q_1 = q$ for some inner function $\phi$.

Proof: The sufficiency follows from the fact that any $g \in q_1\bar{H}_2$
is of the form $q_1\bar{f}$ for some $f \in H_2$, and hence satisfies
$g = q\bar{f}\bar{\phi} \in q\bar{H}_2$. To see the necessity, we note that
$q_1\bar{H}_2 \subset q\bar{H}_2$ implies $q_1 \in q\bar{H}_2$. It follows that
$q_1 = q\bar{\phi}$ for some unimodular $\phi \in H_2$, that is, $\phi$ is an inner
function. This completes the proof. ■

Lemma 11: Let $q_1\bar{H}_2$ and $q_2\bar{H}_2$ be subspaces of $L_2$ where $q_1$
and $q_2$ are unimodular; then $q_1\bar{H}_2 + q_2\bar{H}_2 = L_2$ holds if and only
if $q_1/q_2 \notin \mathcal{J}$.

Proof: We argue by contraposition in both directions. Assume first
that $q_1/q_2 \in \mathcal{J}$, i.e., there exist inner functions $\psi_1, \psi_2$ such that
$q_1/q_2 = \psi_2/\psi_1$. Now, let $q$ be the unimodular function
$q = q_1\psi_1 = q_2\psi_2$. Then by Lemma 10 we have $q_j\bar{H}_2 \subset q\bar{H}_2$ for
$j = 1, 2$, and by linearity it follows that

$$q_1\bar{H}_2 + q_2\bar{H}_2 \subset q\bar{H}_2 \neq L_2.$$

Note that $q\bar{H}_2 \neq L_2$ holds since, e.g., $qz \notin q\bar{H}_2$.

Conversely, assume that $q_1\bar{H}_2 + q_2\bar{H}_2 \neq L_2$; then
$q_1\bar{H}_2 + q_2\bar{H}_2 \subsetneq L_2$ is an invariant subspace for $U^{-1}$ while
not for $U$ (this follows since it contains an analytic function [?]). As a
consequence of this there is a unimodular function $q$ such that
$q_1\bar{H}_2 + q_2\bar{H}_2 = q\bar{H}_2$. This implies that $q_j\bar{H}_2 \subset q\bar{H}_2$ for
$j = 1, 2$. By Lemma 10 there exist inner functions $\psi_1, \psi_2$ such
that $q = q_1\psi_1 = q_2\psi_2$, and hence $q_1/q_2 \in \mathcal{J}$. ■

Proof of Theorem 5: The equivalence between (a) and (b)
follows directly from the Kolmogorov isomorphism. Using Lemma 8
it follows that (b) is equivalent to

$$q_1\bar{H}_2 + q_2\bar{H}_2 = L_2, \tag{10}$$

where $q_1 = g/\bar{g}_{\mathrm{outer}}$ and $q_2 = h/\bar{h}_{\mathrm{outer}}$. By Lemma 11,
Equation (10) holds if and only if $q_1/q_2 \notin \mathcal{J}$. Since
$q_1/q_2 = g\bar{h}\, h_{\mathrm{inner}}/(\bar{g}h\, g_{\mathrm{inner}})$, where
$g_{\mathrm{inner}}, h_{\mathrm{inner}}$ are the inner parts of $g$
and $h$ respectively, the equivalence with Theorem 5 (c) follows. ■


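C. Proof of Corollary 7

If $g^{(1)}\bar{g}^{(j)}/(\bar{g}^{(1)}g^{(j)}) \notin \mathcal{J}$ for some $j = 2, \dots, n$, then from
Theorem 5 we know
$\overline{\mathrm{span}}_{k \ge 0}\{z^{-k} g^{(1)}\} + \overline{\mathrm{span}}_{k \ge 0}\{z^{-k} g^{(j)}\} = L_2$,
which implies that $\sum_{i=1}^{n} \overline{\mathrm{span}}_{k \ge 0}\{z^{-k} g^{(i)}\} = L_2$.

Conversely, suppose $g^{(1)}\bar{g}^{(j)}/(\bar{g}^{(1)}g^{(j)}) \in \mathcal{J}$ for all
$j = 2, \dots, n$; then there exist inner functions $\phi_j, \psi_j$ for each $j$
such that

$$\frac{g^{(1)} \bar{g}^{(j)}}{\bar{g}^{(1)} g^{(j)}} = \frac{\phi_j}{\psi_j},$$

which is equivalent to

$$\frac{q_1}{q_j} = \frac{\phi_j\, g^{(j)}_{\mathrm{inner}}}{\psi_j\, g^{(1)}_{\mathrm{inner}}}, \quad j = 2, \dots, n$$

by Lemma 8. Here $q_i$ are unimodular functions such that
$\overline{\mathrm{span}}_{k \ge 0}\{z^{-k} g^{(i)}\} = q_i\bar{H}_2$ for all $i = 1, \dots, n$. Now let
$q = q_1\, g^{(1)}_{\mathrm{inner}} \prod_{j=2}^{n} \psi_j$; then $q$ is a unimodular function
and satisfies that $q/q_i$ is inner for each $i = 1, \dots, n$, which implies
$q_i\bar{H}_2 \subset q\bar{H}_2$ by Lemma 10. As a result, we conclude that

$$\sum_{i=1}^{n} \overline{\mathrm{span}}_{k \ge 0}\{z^{-k} g^{(i)}\} = \sum_{i=1}^{n} q_i\bar{H}_2 \subset q\bar{H}_2 \neq L_2,$$

which leads to a contradiction. This completes the proof. ■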